Upon successful completion of this course, students will be able to:

::: fragment
* execute basic programming tasks in R (e.g. for loops, while loops, conditional statements)
:::

::: fragment
* understand basic GIS terms and concepts
:::

::: fragment
* utilize GIS for conducting spatial analyses
:::

::: fragment
* appreciate the design and structure of a geographic information system (GIS) as a decision-making tool
:::

::: fragment
* produce maps
:::
You will be graded on four problem sets during the semester (each 12.5% of your grade) and a final report and presentation (each 25% of your grade).

::: fragment
* 4 problem sets: 50% of the final grade (12.5% each)
* final presentation: 25% of the final grade
* final project report: 25% of the final grade
:::
You will undertake a GIS project that emphasizes practical application of spatial analysis techniques using the R programming language
The project entails a few steps:
In the first part of the course, we will learn about the R programming language and its capabilities with respect to spatial data
The first lectures will be dedicated to acquiring the basic knowledge needed to work with spatial data and with R
We will then move on to working with spatial data in R, including how to process vectors and rasters and how to combine the two
In the final part, we will also learn how to deal with spatio-temporal data and point pattern analysis
R is a programming language originally designed for statistical computing
It is an open-source ecosystem (i.e. everyone can contribute and it’s free)
It is compatible with Windows, Mac, and Linux
A variety of libraries already exist that make it easy to do things like:
This is how R compares to other programming languages
R is used in a variety of fields:
Examples of companies which use R include
GIS stands for Geographic Information Systems
GIS is a system that creates, manages, analyzes, and maps all types of data
It helps us understand patterns, relationships, and geographic context
It can be used to:
Mapping focuses on the visual representation of data
Spatial analysis focuses on a variety of aspects:
GIS comprises both mapping (visualization) and geographic data manipulation and analysis
GIS Software
R is good for:
Reading spatial data into R and writing it back out is done through external libraries
sf
sf will be the main library that we will work with
It will help us deal with:
sf: buffer
stars
We can perform geometric operations on rasters (pictures) with the stars package
stars: Temperature in 1901
stars: Temperature in 2022
stars: Temperature difference between 2022 and 1901 > 4
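The cell-by-cell logic behind a map such as "temperature difference > 4" can be sketched with plain matrices. This is only an illustration: the matrices stand in for raster layers, the values are made up, and the names `temp_1901`, `temp_2022`, `temp_diff`, and `warmer_than_4` are hypothetical. With the stars package the same arithmetic would be applied to raster objects rather than to base R matrices.

```r
# Two tiny 2 x 2 "rasters": each cell holds one temperature value.
# The numbers are invented for illustration only.
temp_1901 <- matrix(c(10, 12, 14, 16), nrow = 2)
temp_2022 <- matrix(c(13, 17, 15, 21), nrow = 2)

# Cell-by-cell difference between the two years
temp_diff <- temp_2022 - temp_1901

# Logical grid: TRUE where the difference exceeds 4
warmer_than_4 <- temp_diff > 4

print(temp_diff)
print(warmer_than_4)
```

Because the subtraction and the comparison are applied to every cell at once, no explicit loop over cells is needed.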
geosphere
ggplot2 is the library that will allow us to visualize data analysis results and also to make maps
leaflet is a library that allows us to make interactive maps
mapview is a wrapper around leaflet that automates the addition of labels, popups, color scales, and common basemaps
Example:
We will see that everything that we work with in R is an object
For example, we can load a GeoJSON file into R.
# Load the packages used below
library(geojsonsf)  # provides geojson_sf()
library(sf)         # provides the sf class and its methods
# Step 1: Loading the geojson file
restaurants <- geojson_sf("/Users/bgpopescu/Library/CloudStorage/Dropbox/john_cabot/teaching/big_data/week13/data/restaurant.geojson")
# Step 2: Selecting only the relevant variables
restaurants <- subset(restaurants, select = c(name, `addr:street`))
# Step 3: Removing the restaurants that have neither a name nor an address
restaurants2 <- subset(restaurants, !is.na(name) | !is.na(`addr:street`))
R transforms the GeoJSON file into an object of class sf, which is also a data.frame
This type of object has numerous properties such as:
Coordinate Reference System:
User input: 4326
wkt:
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0,
AUTHORITY["EPSG","8901"]],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AXIS["Latitude",NORTH],
AXIS["Longitude",EAST],
AUTHORITY["EPSG","4326"]]
Once imported, the sf object is stored in the computer's memory
Printing the object displays some of its properties along with its first few rows
Simple feature collection with 2811 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 12.21167 ymin: 41.70574 xmax: 12.77428 ymax: 42.06974
Geodetic CRS: WGS 84
First 10 features:
name addr:street
1 Pizzeria ai Marmi Viale di Trastevere
2 Sichuan Haozi Via di San Martino ai Monti
3 Dar filettaro a Santa Barbara Largo dei Librari
4 Al Peperoncino Via Ostiense
6 Ai Tre Scalini Via Panisperna
7 Trattoria Ada e Mario Circonvallazione Appia
8 Gustosando <NA>
9 Sa Posada Via Elvia Recina
10 Pizzeria Formula 1 Via degli Equi
11 Da Francesco Piazza del Fico
geometry
1 POINT (12.47379 41.88826)
2 POINT (12.49948 41.8958)
3 POINT (12.4737 41.89467)
4 POINT (12.47698 41.85343)
6 POINT (12.49044 41.89628)
7 POINT (12.51433 41.87532)
8 POINT (12.42743 41.89954)
9 POINT (12.5079 41.87995)
10 POINT (12.51268 41.89702)
11 POINT (12.4704 41.89932)
By printing the object, we can see some of its properties including:
dimension
bounding box
crs
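These properties can also be queried one at a time. The sketch below assumes the sf package is installed and builds a two-row object by hand, using the first two names and coordinates from the printed output above; the names `pts`, `lon`, and `lat` are hypothetical and chosen only for this illustration.

```r
library(sf)

# Two restaurants copied from the printed output above
pts <- data.frame(
  name = c("Pizzeria ai Marmi", "Sichuan Haozi"),
  lon  = c(12.47379, 12.49948),
  lat  = c(41.88826, 41.8958)
)

# Turn the plain data.frame into an sf object with CRS EPSG:4326 (WGS 84)
pts <- st_as_sf(pts, coords = c("lon", "lat"), crs = 4326)

class(pts)             # "sf" "data.frame"
st_geometry_type(pts)  # geometry type of each feature
st_bbox(pts)           # bounding box: xmin, ymin, xmax, ymax
st_crs(pts)            # coordinate reference system
```

Each accessor returns one of the properties that the print method summarizes in its header.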
One of the characteristics of object-oriented programming is inheritance
Inheritance makes it possible for one class to extend another class by adding new properties
Example:
A “taxi” class is an extension of a “car” class, inheriting all of its properties and methods.
A taxi could have new properties such as the taxi company name
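The car and taxi example can be sketched with R's S3 class system. This is a minimal illustration, not how any particular package is implemented; the constructors `new_car()` and `new_taxi()`, the generic `describe()`, and the example values are all hypothetical.

```r
# Constructor for the parent class "car"
new_car <- function(brand, seats) {
  structure(list(brand = brand, seats = seats), class = "car")
}

# Constructor for the child class "taxi": a car plus one extra property
new_taxi <- function(brand, seats, company) {
  obj <- new_car(brand, seats)
  obj$company <- company
  class(obj) <- c("taxi", class(obj))  # "taxi" comes first, "car" second
  obj
}

# A generic function with a method defined only for the parent class
describe <- function(x, ...) UseMethod("describe")
describe.car <- function(x, ...) {
  paste(x$brand, "with", x$seats, "seats")
}

my_taxi <- new_taxi("Fiat", 4, "Example Taxi Co")
class(my_taxi)     # "taxi" "car"
describe(my_taxi)  # no describe.taxi() exists, so describe.car() is used
my_taxi$company    # the property added by the child class
```

An sf object works the same way: its class vector is `"sf"` followed by `"data.frame"`, so functions written for data frames still apply to it.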
In R, every complex object is a collection of smaller components such as properties
We can use str() to examine the structure of an object
Classes 'sf' and 'data.frame': 2811 obs. of 3 variables:
$ name : chr "Pizzeria ai Marmi" "Sichuan Haozi" "Dar filettaro a Santa Barbara" "Al Peperoncino" ...
$ addr:street: chr "Viale di Trastevere" "Via di San Martino ai Monti" "Largo dei Librari" "Via Ostiense" ...
$ geometry :sfc_POINT of length 2811; first list element: 'XY' num 12.5 41.9
- attr(*, "sf_column")= chr "geometry"
- attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA
..- attr(*, "names")= chr [1:2] "name" "addr:street"
For example, the names of the restaurants are stored as a character (string) variable called name (second line of output)
The addresses of the restaurants are stored as a character (string) variable called addr:street (third line of output)
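Individual components can be pulled out of an object with `$`. The sketch below uses a plain data.frame rather than an sf object so that it runs without extra packages; `restaurants_demo` is a hypothetical name, and its two rows are copied from the output above.

```r
# A small data.frame mimicking the first two rows shown above.
# check.names = FALSE keeps the non-standard column name "addr:street".
restaurants_demo <- data.frame(
  name = c("Pizzeria ai Marmi", "Sichuan Haozi"),
  `addr:street` = c("Viale di Trastevere", "Via di San Martino ai Monti"),
  check.names = FALSE
)

str(restaurants_demo)           # overview of all components
restaurants_demo$name           # the name column as a character vector
restaurants_demo$`addr:street`  # backticks are needed because of the colon
attributes(restaurants_demo)    # names, class, and row names
```

Column names containing characters such as `:` must be wrapped in backticks, which is why the earlier code writes `` `addr:street` ``.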
# Load dplyr for left_join()
library(dplyr)
# Data cleaning: drop the rows whose Code is listed in weird_labels
clean_countries <- subset(life_expectancy2, !(Code %in% weird_labels))
clean_countries_urbanization <- subset(urbanization2, !(Code %in% weird_labels))
# Left join: keep every row of clean_countries and add the matching urbanization columns
new_data <- left_join(clean_countries, clean_countries_urbanization, by = "Code")
We will now familiarize ourselves with the R environment
We first need to install R: R-project
We will then install an R interface that allows us to interact with R in a more user-friendly manner: RStudio
Popescu (JCU): Lecture 1